Quadratic Time Algorithms for Finding Common Intervals in Two and More Sequences
Abstract
A popular approach in comparative genomics is to locate groups or clusters of orthologous genes in multiple genomes and to postulate functional association between the genes contained in such clusters. To this end, genomes are often represented as permutations of their genes, and common intervals, i.e. intervals containing the same set of genes, are interpreted as gene clusters. A disadvantage of modelling genomes as permutations is that paralogous copies of the same gene inside one genome cannot be modelled. In this paper we consider a slightly modified model that allows paralogs, simply by representing genomes as sequences rather than permutations of genes. We define common intervals based on this model, and we present a simple algorithm that finds all common intervals of two sequences in Θ(n²) time using Θ(n²) space. Another, more complicated algorithm runs in O(n²) time and uses only linear space. We also show how to extend the simple algorithm to more than two genomes, and we present results from the application of our algorithms to real data.
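To make the definition concrete, here is a minimal brute-force sketch of common intervals on sequences. It is not either of the paper's algorithms and does not achieve their quadratic bounds; it only illustrates the idea that an interval of one sequence and an interval of the other are "common" when they contain exactly the same set of genes. The function names `interval_gene_sets` and `common_intervals` are hypothetical, and the sketch reports every matching interval rather than only maximal ones.

```python
from collections import defaultdict


def interval_gene_sets(seq):
    """Map each gene set to the (start, end) intervals of seq that contain exactly that set.

    Indices are 0-based and inclusive. Illustrative helper, not from the paper.
    """
    locations = defaultdict(list)
    for start in range(len(seq)):
        genes = set()
        for end in range(start, len(seq)):
            genes.add(seq[end])  # extend the interval by one position
            locations[frozenset(genes)].append((start, end))
    return locations


def common_intervals(seq_a, seq_b):
    """Return {gene set: (intervals in seq_a, intervals in seq_b)} for shared gene sets.

    Brute force: enumerates every interval of both sequences, so it is slower
    than the quadratic-time algorithms the abstract describes.
    """
    loc_a = interval_gene_sets(seq_a)
    loc_b = interval_gene_sets(seq_b)
    shared = loc_a.keys() & loc_b.keys()
    return {genes: (loc_a[genes], loc_b[genes]) for genes in shared}


if __name__ == "__main__":
    # Gene 1 occurs twice in the second genome, modelling a paralog.
    for genes, (in_a, in_b) in common_intervals([1, 2, 3], [3, 1, 2, 1]).items():
        print(sorted(genes), in_a, in_b)
```

Because the second sequence may repeat a gene, one gene set can occur at several locations in it, which is exactly the situation the permutation model cannot express.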
Similar resources
Finding Nested Common Intervals Efficiently
In this article, we study the problem of efficiently finding gene clusters formalized by nested common intervals between two genomes represented either as permutations or as sequences. Considering permutations, we give several algorithms whose running time depends on the size of the actual output rather than the output in the worst case. Indeed, we first provide a straightforward cubic time alg...
A mathematically simple method based on definition for computing eigenvalues, generalized eigenvalues and quadratic eigenvalues of matrices
In this paper, a fundamentally new method, based on the definition, is introduced for numerical computation of eigenvalues, generalized eigenvalues and quadratic eigenvalues of matrices. Some examples are provided to show the accuracy and reliability of the proposed method. It is shown that the proposed method gives other sequences than that of existing methods but they still are convergent to th...
Quadratic-time Algorithm for the String Constrained LCS Problem
The problem of finding a longest common subsequence of two main sequences with some constraint that must be a substring of the result (STR-IC-LCS) was formulated recently. It is a variant of the constrained longest common subsequence problem. As the known algorithms for the STR-IC-LCS problem are cubic-time, the presented quadratic-time algorithm is significantly faster.
Common Zero Points of Two Finite Families of Maximal Monotone Operators via Proximal Point Algorithms
This work presents iterative schemes for reaching common points of the solution set of a system of generalized mixed equilibrium problems, the solution set of the variational inequality for an inverse-strongly monotone operator, the common fixed point set of two infinite sequences of relatively nonexpansive mappings, and the common zero point set of two finite sequences of maximal monot...
Efficient Algorithms for Just-In-Time Scheduling on a Batch Processing Machine
The just-in-time scheduling problem on a single batch processing machine is investigated in this research. Batch processing machines can process more than one job simultaneously and are widely used in the semiconductor industry. Due to the requirements of the just-in-time strategy, minimization of total earliness and tardiness penalties is considered as the criterion. It is an acceptable criterion for b...
Publication date: 2004